SNIF TOOL - Sniffing for Patterns in Continuous Streams

Author: MUKHERJI ABHISHEK
Publication venue: Digital WPI
Publication date: 11/02/2008

Recent technological advances in sensor networks and mobile devices give rise to new challenges in the processing of live streams. In particular, time-series sequence matching, namely the similarity matching of live streams against a set of predefined pattern sequence queries, is an important technology for a broad range of domains, including monitoring the spread of hazardous waste and administering network traffic. In this thesis, I use the time-critical application of monitoring fire growth in an intelligent building as my motivating example. Various measures and algorithms have been established in the current literature for the similarity of static time-series data. Matching continuous data poses the following new challenges: 1) fluctuations in stream characteristics, 2) real-time requirements of the application, 3) limited system resources, and 4) noisy data. Thus, the matching techniques proposed for static time series are mostly not applicable to live stream matching. In this thesis, I propose a new generic framework, henceforth referred to as the n-Snippet Indices Framework (SNIF for short), for discovering the similarity between a live stream and pattern sequences. The framework is composed of two key phases: (1) an off-line preprocessing phase, in which the pattern sequences are processed and stored in an approximate 2-level index structure; and (2) an on-line live stream matching phase, in which the streaming time series (the live stream) is matched on the fly against the indexed pattern sequences. I introduce the concept of n-Snippets for numeric data as the unit for matching. The insight is to match small snippets of the live stream against prefixes of the patterns and maintain them in succession. The longer the pattern prefixes identified as similar to the live stream, the stronger the confirmation of the match.
Thus, live stream matching is performed at two levels: bag matching for matching snippets and order checking for maintaining the lengths of the match. I propose four variations of matching algorithms that let the user choose between the two conflicting characteristics of result accuracy and response time. The effectiveness of SNIF in detecting patterns has been thoroughly tested through extensive experimental evaluations using the continuous query engine CAPE as the platform. The evaluations made use of real datasets from multiple domains, including fire monitoring, chlorine monitoring and sensor networks. Moreover, SNIF is demonstrated to be tolerant of noisy datasets.
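
The two-level idea in this abstract (order-insensitive "bag" matching of snippets, then order checking to track how long a matched pattern prefix is) can be illustrated with a minimal sketch. It is not the thesis's algorithm or index structure: the function names, the non-overlapping snippet split, the sorted-value comparison, and the `tol` tolerance are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def snippets(series: Sequence[float], n: int) -> List[Tuple[float, ...]]:
    """Cut a series into consecutive, non-overlapping snippets of length n."""
    return [tuple(series[i:i + n]) for i in range(0, len(series) - n + 1, n)]


def bag_match(a: Sequence[float], b: Sequence[float], tol: float) -> bool:
    """Order-insensitive comparison (assumed): sorted values agree within tol."""
    return all(abs(x - y) <= tol for x, y in zip(sorted(a), sorted(b)))


def matched_prefix_lengths(
    stream: Sequence[float],
    patterns: Dict[str, Sequence[float]],
    n: int,
    tol: float,
) -> Dict[str, int]:
    """For each pattern, count the leading snippets that match the stream in order."""
    stream_snips = snippets(stream, n)
    result = {}
    for name, pattern in patterns.items():
        length = 0
        for s, p in zip(stream_snips, snippets(pattern, n)):
            if not bag_match(s, p, tol):
                break  # order checking: the matched prefix ends here
            length += 1
        result[name] = length
    return result


# Hypothetical data: a noisy stream and two stored patterns.
stream = [1.0, 2.1, 3.0, 4.2, 9.0, 9.0]
patterns = {"fire": [1, 2, 3, 4, 5, 6], "flat": [1, 1, 1, 1, 1, 1]}
print(matched_prefix_lengths(stream, patterns, n=2, tol=0.3))
```

In this sketch a longer matched prefix is read as stronger evidence for the pattern, and the tolerance is the knob that absorbs noise in the stream.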

DigitalCommons@WPI

Pattern Mining and Sense-Making Support for Enhancing the User Experience

Author: Mukherji Abhishek
Publication venue: Digital WPI
Publication date: 07/12/2018

While data mining techniques such as frequent itemset and sequence mining are well established as powerful pattern discovery tools in domains ranging from science and medicine to business, a drawback is the lack of support for interactive exploration of the large numbers of patterns generated with diverse parameter settings and of the relationships among the mined patterns. To enhance the user experience, real-time query turnaround times and improved support for interactive mining are desired. There is also increasing interest in applying data mining solutions to mobile data. Patterns mined over mobile data may enable context-aware applications ranging from automating frequently repeated tasks to providing personalized recommendations. Overall, this dissertation addresses three problems that limit the utility of data mining, namely (a) the lack of interactive exploration tools for mined patterns, (b) insufficient support for mining localized patterns, and (c) high computational requirements that prohibit mining patterns on smaller compute units such as a smartphone. This dissertation develops interactive frameworks for the guided exploration of mined patterns and their relationships. Contributions include the PARAS preprocessing and indexing framework, which enables analysts to gain key insights into rule relationships in a parameter space view through compact storage of rules that supports query-time reconstruction of complete rulesets. Contributions also include the visual rule exploration framework FIRE, which presents an interactive dual view of the parameter space and the rule space that together enable enhanced sense-making of rule relationships. This dissertation also supports the online mining of localized association rules computed on data subsets by selectively deploying alternative execution strategies that leverage a multidimensional itemset-based data partitioning index.
Finally, we designed OLAPH, an on-device context-aware service that learns phone usage patterns over mobile context data such as app usage, location, and call and SMS logs to provide device intelligence. Concepts introduced for modeling mobile data as sequences include compressing context logs to intervaled context events, adding generalized time features, and identifying meaningful sequences via filter expressions.
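
One plausible reading of "compressing context logs to intervaled context events" is run-length merging of repeated readings into intervals. The sketch below shows that reading only; the `(timestamp, context)` log format and the `to_intervals` name are assumptions, not details taken from the dissertation.

```python
from typing import List, Tuple


def to_intervals(log: List[Tuple[int, str]]) -> List[Tuple[str, int, int]]:
    """Merge consecutive readings of the same context into (context, start, end)."""
    intervals: List[Tuple[str, int, int]] = []
    for ts, ctx in log:
        if intervals and intervals[-1][0] == ctx:
            # Same context as the previous reading: extend the open interval.
            prev_ctx, start, _ = intervals[-1]
            intervals[-1] = (prev_ctx, start, ts)
        else:
            # Context changed: start a new interval at this timestamp.
            intervals.append((ctx, ts, ts))
    return intervals


# Hypothetical location log sampled every 5 minutes.
log = [(0, "home"), (5, "home"), (10, "work"), (15, "work"), (20, "work"), (25, "home")]
print(to_intervals(log))
```

Six raw readings collapse to three interval events here, which is the kind of reduction that would make later sequence mining cheaper on a phone.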

DigitalCommons@WPI

MACHINE LEARNING FRAMEWORK FOR PRIORITIZING LOCATION MEASUREMENTS OF MULTIPLE DEVICES

Author: Bhattacharyya Abhishek
Mukherji Abhishek
Pandey Santosh
Raghuram Vinay
Silverman Matt
Tran Huy
Zhang Xu
Publication venue: Technical Disclosure Commons
Publication date: 21/12/2018

Presented herein is a framework for prioritizing location measurements of multiple client devices. In particular, rather than using a round-robin scheduling approach, the techniques presented herein utilize a machine learning block (e.g., random forests) to predict a score for each client device, along with a score-based scheduler.
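
The contrast between round-robin and score-based scheduling can be sketched as follows. The scores are hard-coded stand-ins for the output of the machine learning block, and the function names and slot model are assumptions for illustration rather than details from the disclosure.

```python
import heapq
from typing import Dict, List


def round_robin(devices: List[str], slots: int, start: int = 0) -> List[str]:
    """Baseline: take devices in fixed cyclic order, ignoring any score."""
    return [devices[(start + i) % len(devices)] for i in range(slots)]


def score_based(scores: Dict[str, float], slots: int) -> List[str]:
    """Pick the devices with the highest predicted scores, best first."""
    top = heapq.nlargest(slots, scores.items(), key=lambda kv: kv[1])
    return [device for device, _ in top]


# Hypothetical scores, standing in for a model's per-device predictions.
scores = {"dev-a": 0.2, "dev-b": 0.9, "dev-c": 0.5}
print(round_robin(list(scores), slots=2))
print(score_based(scores, slots=2))
```

With two measurement slots, the baseline serves whichever devices are next in line, while the score-based variant serves the two devices the model ranks highest.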

Technical Disclosure Commons

Welcome message from the General Chairs

Author: ALSHURAFA Nabil I.
MISRA Archan
MUKHERJI Abhishek
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2019

Institutional Knowledge at Singapore Management University

Did you take a break today? Detecting playing foosball using your smartwatch

Author: MISRA Archan
MUKHERJI Abhishek
RACHURI Kiran K.
SEN Sougata
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2016

Ministry of Education, Singapore under its Academic Research Funding Tier 2; National Research Foundation (NRF) Singapore under IDM Futures Funding Initiative

Institutional Knowledge at Singapore Management University

Paras: interactive parameter space exploration for association rule mining

Author: Abhishek Mukherji
Carolina Ruiz
Christopher R Botaish
Elke A Rundensteiner
Jason Whitehouse
Matthew O Ward
Xika Lin
Publication date: 01/01/2013

We demonstrate our PARAS technology for supporting interactive association mining at near real-time speeds. Key technical innovations of PARAS, in particular stable region abstractions and rule redundancy management supporting novel parameter space-centric exploratory queries, will be showcased. The audience will be able to interactively explore the parameter space view of rules. They will experience the near real-time speeds achieved by PARAS for operations, such as comparing rulesets mined using different parameter values, that would otherwise take hours of computation and much manual investigation. Overall, we will demonstrate that the PARAS system provides a rich experience to data analysts while significantly reducing trial-and-error interactions.

CiteSeerX

Adding Intelligence to Your Mobile Device via On-Device Sequential Pattern Mining

Author: Abhishek Mukherji
Evan Welbourne
Vijay Srinivasan
Publication date: 02/04/2020

The next revolution in mobile user experience is predicted to be a smart device that can adapt to its user's lifestyle and surroundings to become a proactive personal assistant. We introduce the idea of a Mobile Sequence Mining (MSM) engine that automatically learns sequential phone usage patterns over the rich context data captured within the device. The learned patterns can then enable a variety of applications, including proactive assistance for a range of use cases. Unlike existing cloud-based intelligence services (e.g., GoogleNow) that rely on internet access and may compromise privacy, MSM provides device intelligence by leveraging mined longitudinal patterns while preserving privacy via on-device mining. MSM is generic and can provide sequential patterns and predictions over multiple data streams, and it also allows individual mobile applications to stream their own private data to mine sequential patterns. In our preliminary tests deploying MSM on 3 user devices, it mines frequent sequential patterns within 8 minutes over 7-53 days of longitudinal user context data, including location, app usage and call logs, spanning 137-312 unique contexts. We conclude the paper by enumerating future research challenges for mobile sequence mining.
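
As a rough illustration of what mining frequent sequential patterns over context data involves, the sketch below counts contiguous context sequences that recur across days. It is a deliberate simplification and not the MSM engine: general sequential pattern mining also allows gaps between events, and the per-day session format, the `frequent_sequences` name, and the support threshold are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def frequent_sequences(
    sessions: List[List[str]], length: int, min_support: int
) -> Dict[Tuple[str, ...], int]:
    """Return contiguous sequences of the given length seen in >= min_support sessions."""
    counts: Counter = Counter()
    for session in sessions:
        # Use a set so each session contributes at most once per sequence.
        seen = {
            tuple(session[i:i + length])
            for i in range(len(session) - length + 1)
        }
        counts.update(seen)
    return {seq: c for seq, c in counts.items() if c >= min_support}


# Hypothetical context events, one list per day.
days = [
    ["home", "maps", "work"],
    ["home", "maps", "work", "email"],
    ["home", "music", "gym"],
]
print(frequent_sequences(days, length=2, min_support=2))
```

A pattern such as ("home", "maps") recurring on most days is the kind of signal a proactive assistant could act on, for example by suggesting navigation when the user is about to leave home.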

CiteSeerX
